two-way consistency
385822e359afa26d52b5b286226f2cea-Paper.pdf
In contrast, classical graphical methods like A* search are able to solve long-horizon tasks, but assume that the state space is abstracted away from raw sensory input. Recent works have attempted to combine the strengths of deep learning and classical planning; however, dominant methods in this domain are stillquite brittle andscale poorly withthesizeoftheenvironment.
Learning state abstractions for long-horizon planning
Many tasks that we do on a regular basis, such as navigating a city, cooking a meal, or loading a dishwasher, require planning over extended periods of time. Accomplishing these tasks may seem simple to us; however, reasoning over long time horizons remains a major challenge for today's Reinforcement Learning (RL) algorithms. While unable to plan over long horizons, deep RL algorithms excel at learning policies for short horizon tasks, such as robotic grasping, directly from pixels. At the same time, classical planning methods such as Dijkstra's algorithm and A search can plan over long time horizons, but they require hand-specified or task-specific abstract representations of the environment as input. To achieve the best of both worlds, state-of-the-art visual navigation methods have applied classical search methods to learned graphs.
Learning State Abstractions for Long-Horizon Planning
Many tasks that we do on a regular basis, such as navigating a city, cooking a meal, or loading a dishwasher, require planning over extended periods of time. Accomplishing these tasks may seem simple to us; however, reasoning over long time horizons remains a major challenge for today's Reinforcement Learning (RL) algorithms. While unable to plan over long horizons, deep RL algorithms excel at learning policies for short horizon tasks, such as robotic grasping, directly from pixels. At the same time, classical planning methods such as Dijkstra's algorithm and A$ *$ search can plan over long time horizons, but they require hand-specified or task-specific abstract representations of the environment as input. To achieve the best of both worlds, state-of-the-art visual navigation methods have applied classical search methods to learned graphs.